<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<html xmlns="http://www.w3.org/1999/xhtml">
<head>
<title>FastaChipper.pl</title>
<link rel="stylesheet" href="jperl.css" type="text/css" />
<link rev="made" href="mailto:root@localhost" />
</head>

<body>
<table border="0" width="100%" cellspacing="0" cellpadding="3">
<tr><td class="block" valign="middle">
<big><strong><span class="block">&nbsp;FastaChipper.pl</span></strong></big>
</td></tr>
</table>

<p><a name="__index__"></a></p>
<!-- INDEX BEGIN -->

<ul>

	<li><a href="#name">NAME</a></li>
	<li><a href="#synopsis">SYNOPSIS</a></li>
	<li><a href="#description">DESCRIPTION</a></li>
	<li><a href="#arguments">ARGUMENTS</a></li>
	<li><a href="#author">AUTHOR</a></li>
</ul>
<!-- INDEX END -->

<hr />
<p>
</p>
<h1><a name="name">NAME</a></h1>
<p>FastaChipper.pl - Randomly select subsequences from a multifasta file.</p>
<p>
</p>
<hr />
<h1><a name="synopsis">SYNOPSIS</a></h1>
<pre>
   FastaChipper -i InFile.fasta -o OutFile.fasta -n NumSeqs
                -l SeqLen -m MinInputLen -q</pre>
<p>
</p>
<hr />
<h1><a name="description">DESCRIPTION</a></h1>
<p>Takes a sequence input file in fasta format and
randomly selects [n] subsequences of length [l].
This is an extremely slow way to do the selection. A better
approach is to load the sequences into a database with an
incremental row number, and then select the sequence string
from the database using the row number as the selection
criterion. Alternatively, the sequence strings could be
loaded into an array that is used to fetch sequence data. This
latter method will be memory intensive, but is probably the
fastest.</p>
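<p>The in-memory alternative mentioned above can be sketched in Python as
follows. This is an illustration only, not the code used by FastaChipper.pl.
The function names <code>read_fasta</code> and <code>chip</code> are
hypothetical, and the sketch assumes that the primary ID is the first word of
the header line, that records are drawn with replacement, that start positions
are zero-based, and that a record must be at least as long as both [l] and [m]
to be eligible.</p>
<pre>
```python
import random


def read_fasta(lines):
    """Parse FASTA lines into a list of (primary_id, sequence) tuples."""
    records = []
    seq_id = None
    chunks = []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A new header closes the previous record, if there was one.
            if seq_id is not None:
                records.append((seq_id, "".join(chunks)))
            # Assumption: the primary ID is the first word of the header.
            words = line[1:].split()
            seq_id = words[0] if words else ""
            chunks = []
        else:
            chunks.append(line)
    if seq_id is not None:
        records.append((seq_id, "".join(chunks)))
    return records


def chip(records, num_seqs, seq_len=700, min_input_len=700, rng=random):
    """Pick num_seqs random substrings of length seq_len from the array."""
    # Assumption: a record must satisfy both length limits to be eligible.
    min_len = max(seq_len, min_input_len)
    eligible = [rec for rec in records if len(rec[1]) >= min_len]
    if not eligible:
        raise ValueError("no input sequence is long enough")
    chips = []
    for _ in range(num_seqs):
        # Every eligible record is equally likely, whatever its length.
        seq_id, seq = rng.choice(eligible)
        # Zero-based start chosen so the chip fits inside the record.
        start = rng.randrange(len(seq) - seq_len + 1)
        chips.append((seq_id, start, seq[start:start + seq_len]))
    return chips
```
</pre>
<p>Because every record is held in a list, each draw is a constant-time index
lookup, at the cost of keeping the whole input file in memory.</p>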
<p>Note that in this selection model, the total length of the
sequence database is not used in the selection process.
In other words, all sequence records have an equal
likelihood of being selected, regardless of length.</p>
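<p>The difference between this model and a length-weighted one can be shown
with a small Python sketch. The record names and lengths are made up for the
example, and the length-weighted variant is shown only for contrast; it is not
what the script is described as doing.</p>
<pre>
```python
import random

# Hypothetical record lengths, keyed by sequence ID.
lengths = {"short": 1000, "long": 9000}
ids = list(lengths)
rng = random.Random(42)

# Model described above: every record is equally likely to be picked.
uniform_pick = rng.choice(ids)

# Contrasting model (an assumption for illustration): longer records
# are picked proportionally more often.
weights = [lengths[name] for name in ids]
weighted_pick = rng.choices(ids, weights=weights, k=1)[0]


def selection_probabilities(lengths, by_length=False):
    """Return each record's chance of being picked under either model."""
    if by_length:
        total = sum(lengths.values())
        return {name: n / total for name, n in lengths.items()}
    return {name: 1 / len(lengths) for name in lengths}
```
</pre>
<p>With these two records, the equal-likelihood model gives each record a
probability of 0.5, while length weighting would give 0.1 and 0.9.</p>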
<p>The new pipe-delimited header in the output fasta file contains the following fields: Unique Name | 
Input sequence primary ID | Original clone length | 
Start position of the substring in the source sequence | 
End position of the substring in the source sequence</p>
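<p>A minimal Python sketch of building and reading such a header is given
below. The function names, the unique name <code>chip_1</code>, and the absence
of spaces around the pipe character are assumptions made for the example; the
exact layout written by the script may differ. Because primary IDs can
themselves contain pipe characters, the parser takes the first field from the
left and the last three fields from the right, which assumes the unique name
contains no pipe.</p>
<pre>
```python
def make_header(unique_name, primary_id, clone_len, start, end):
    """Join the five header fields with a pipe delimiter."""
    fields = [unique_name, primary_id, clone_len, start, end]
    return ">" + "|".join(str(field) for field in fields)


def parse_header(header):
    """Split a header line back into its five fields."""
    body = header.strip().lstrip(">")
    # The unique name is the first field (assumed to contain no pipe).
    name, rest = body.split("|", 1)
    # The three numeric fields are taken from the right, so a primary
    # ID that contains pipes is kept intact.
    primary_id, clone_len, start, end = rest.rsplit("|", 3)
    return {
        "name": name,
        "primary_id": primary_id,
        "clone_len": int(clone_len),
        "start": int(start),
        "end": int(end),
    }
```
</pre>
<p>For example, <code>make_header("chip_1", "gi|123|gb|X", 5000, 101, 800)</code>
produces a header that <code>parse_header</code> splits back into the same
five values.</p>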
<p>
</p>
<hr />
<h1><a name="arguments">ARGUMENTS</a></h1>
<dl>
<dt><strong><a name="item_%2di_infile%2efasta">-i InFile.fasta</a></strong><br />
</dt>
<dd>
The path to the input multifasta file that contains all of the 
sequences to select from.
</dd>
<p></p>
<dt><strong><a name="item_%2do_outfile%2efasta">-o OutFile.fasta</a></strong><br />
</dt>
<dd>
The path to the fasta file that will contain all of the random
sequences selected.
</dd>
<p></p>
<dt><strong><a name="item_%2dn_numseqs">-n NumSeqs</a></strong><br />
</dt>
<dd>
The number of sequences that will be created.
</dd>
<p></p>
<dt><strong><a name="item_%2dl_seqlen">-l SeqLen</a></strong><br />
</dt>
<dd>
The length of the sequences that will be created.
The default value is 700.
</dd>
<p></p>
<dt><strong><a name="item_%2dm_mininputlen">-m MinInputLen</a></strong><br />
</dt>
<dd>
The minimum length that an input sequence must be to be considered
for selection. The default value is 700.
</dd>
<p></p>
<dt><strong><a name="item_%2dq">-q</a></strong><br />
</dt>
<dd>
Flag to run the program in quiet mode.
</dd>
<p></p></dl>
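<p>For readers who want to reproduce the same command line interface in
another tool, the options above could be declared in Python roughly as
follows. This is a sketch, not the script's own option handling. Treating
<code>-i</code>, <code>-o</code> and <code>-n</code> as required is an
assumption, since no defaults are documented for them, and the destination
names are hypothetical.</p>
<pre>
```python
import argparse


def build_parser():
    """Declare the documented options with their documented defaults."""
    parser = argparse.ArgumentParser(prog="FastaChipper")
    # Assumption: the three options without documented defaults are required.
    parser.add_argument("-i", dest="infile", required=True,
                        help="path to the input multifasta file")
    parser.add_argument("-o", dest="outfile", required=True,
                        help="path to the output fasta file")
    parser.add_argument("-n", dest="num_seqs", type=int, required=True,
                        help="number of sequences to create")
    parser.add_argument("-l", dest="seq_len", type=int, default=700,
                        help="length of the sequences to create")
    parser.add_argument("-m", dest="min_input_len", type=int, default=700,
                        help="minimum length of an eligible input sequence")
    parser.add_argument("-q", dest="quiet", action="store_true",
                        help="run in quiet mode")
    return parser
```
</pre>
<p>Calling <code>build_parser().parse_args()</code> with only the three
required options leaves both length settings at 700 and quiet mode off.</p>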
<p>
</p>
<hr />
<h1><a name="author">AUTHOR</a></h1>
<p>James C. Estill &lt;JamesEstill at gmail.com&gt;</p>

</body>

</html>
